Estimating the resolution limit of the map equation in community detection
A community detection algorithm is considered to have a resolution limit if
the scale of the smallest modules that can be resolved depends on the size of
the analyzed subnetwork. The resolution limit is known to prevent some
community detection algorithms from accurately identifying the modular
structure of a network. In fact, any global objective function for measuring
the quality of a two-level assignment of nodes into modules must have some sort
of resolution limit or an external resolution parameter. However, it is not yet
known how the resolution limit affects the so-called map equation, which is
known to be an efficient objective function for community detection. We derive
an analytical estimate and conclude that the resolution limit of the map
equation is set by the total number of links between modules instead of the
total number of links in the full network as for modularity. This mechanism
makes the resolution limit much less restrictive for the map equation than for
modularity, and in practice orders of magnitude smaller. Furthermore, we argue
that the effect of the resolution limit often results from shoehorning
multi-level modular structures into two-level descriptions. As we show, the
hierarchical map equation effectively eliminates the resolution limit for
networks with nested multi-level modular structures.
Comment: 12 pages, 7 figures
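As a rough illustration of the two-level objective that the abstract above refers to, the sketch below evaluates the standard two-level map equation for an undirected, unweighted network and a given assignment of nodes to modules. The function names (`plogp`, `map_equation`) and the toy network of two triangles joined by one link are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce the paper's resolution-limit estimate.

```python
from collections import defaultdict
from math import log2


def plogp(p):
    """Return p * log2(p), with the convention 0 * log(0) = 0."""
    return p * log2(p) if p > 0 else 0.0


def map_equation(edges, module_of):
    """Two-level map equation codelength (bits per step) for an
    undirected, unweighted network.

    edges     -- list of (u, v) pairs (hypothetical toy input format)
    module_of -- dict mapping each node to a module label
    """
    degree = defaultdict(int)
    cut = defaultdict(int)  # number of link ends leaving each module
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if module_of[u] != module_of[v]:
            cut[module_of[u]] += 1
            cut[module_of[v]] += 1

    two_m = 2 * len(edges)
    # Stationary visit rate of a random walker: degree / (2 * links).
    p_node = {n: d / two_m for n, d in degree.items()}
    p_mod = defaultdict(float)
    for n, p in p_node.items():
        p_mod[module_of[n]] += p
    # Exit rate of each module: links crossing its boundary / (2 * links).
    q_mod = {m: cut[m] / two_m for m in p_mod}
    q_total = sum(q_mod.values())

    return (plogp(q_total)
            - 2 * sum(plogp(q) for q in q_mod.values())
            - sum(plogp(p) for p in p_node.values())
            + sum(plogp(q_mod[m] + p_mod[m]) for m in p_mod))


# Toy network: two triangles joined by a single link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
one = map_equation(edges, {n: "all" for n in range(6)})
two = map_equation(edges, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
print(f"one module: {one:.3f} bits, two modules: {two:.3f} bits")
```

On this toy network the two-module description is shorter than the one-module description, so the objective prefers splitting the triangles. Note that the only network-wide quantity entering the exit rates is the number of links between modules.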
Stock portfolio structure of individual investors infers future trading behavior
Although the understanding of and motivation behind individual trading
behavior is an important puzzle in finance, little is known about the
connection between an investor's portfolio structure and her trading behavior
in practice. In this paper, we investigate the relation between what stocks
investors hold, and what stocks they buy, and show that investors with similar
portfolio structures to a great extent trade in a similar way. With data from
the central register of shareholdings in Sweden, we model the market in a
similarity network, by considering investors as nodes, connected with links
representing portfolio similarity. From the network, we find groups of
investors that not only identify different investment strategies, but also
represent groups of individual investors trading in a similar way. These
findings suggest that the stock portfolios of investors hold meaningful
information, which could be used to gain a better understanding of stock market
dynamics.
Comment: 9 pages, 4 figures, 1 table
Modeling sequences and temporal networks with dynamic community structures
In evolving complex systems such as air traffic and social organizations,
collective effects emerge from their many components' dynamic interactions.
While the dynamic interactions can be represented by temporal networks with
nodes and links that change over time, they remain highly complex. It is
therefore often necessary to use methods that extract the temporal networks'
large-scale dynamic community structure. However, such methods are subject to
overfitting or suffer from effects of arbitrary, a priori imposed timescales,
which should instead be extracted from data. Here we simultaneously address
both problems and develop a principled data-driven method that determines
relevant timescales and identifies patterns of dynamics that take place on
networks as well as shape the networks themselves. We base our method on an
arbitrary-order Markov chain model with community structure, and develop a
nonparametric Bayesian inference framework that identifies the simplest such
model that can explain temporal interaction data.
Comment: 15 pages, 6 figures, 2 tables
Understanding Complex Systems: From Networks to Optimal Higher-Order Models
To better understand the structure and function of complex systems,
researchers often represent direct interactions between components in complex
systems with networks, assuming that indirect influence between distant
components can be modelled by paths. Such network models assume that actual
paths are memoryless. That is, the way a path continues as it passes through a
node does not depend on where it came from. Recent studies of data on actual
paths in complex systems question this assumption and instead indicate that
memory in paths does have considerable impact on central methods in network
science. A growing research community working with so-called higher-order
network models addresses this issue, seeking to take advantage of information
that conventional network representations disregard. Here we summarise the
progress in this area and outline remaining challenges calling for more
research.
Comment: 8 pages, 4 figures
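The memoryless assumption described in the abstract above can be made concrete with a small sketch. It estimates transition probabilities from observed paths under a first-order model (the next step depends only on the current node) and a second-order model (the next step also depends on the previous node). The function name `transition_probs` and the toy paths are hypothetical and chosen only for illustration; the code is not drawn from the paper.

```python
from collections import Counter, defaultdict


def transition_probs(paths, order):
    """Estimate next-step probabilities conditioned on the last `order` nodes."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            memory = tuple(path[i - order:i])  # the last `order` nodes visited
            counts[memory][path[i]] += 1
    return {
        memory: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for memory, ctr in counts.items()
    }


# Toy path data: walkers arriving at "c" from "a" always continue to "d",
# while walkers arriving from "b" always continue to "e".
paths = [["a", "c", "d"], ["a", "c", "d"], ["b", "c", "e"], ["b", "c", "e"]]

first = transition_probs(paths, order=1)
second = transition_probs(paths, order=2)

print(first[("c",)])       # memoryless view of leaving "c"
print(second[("a", "c")])  # leaving "c" given arrival from "a"
print(second[("b", "c")])  # leaving "c" given arrival from "b"
```

In this toy data the first-order model sees an even split between "d" and "e" when leaving "c", whereas the second-order model recovers that the continuation is fully determined by where the path came from. This is the kind of information a conventional network representation disregards.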
An information-theoretic framework for resolving community structure in complex networks
To understand the structure of a large-scale biological, social, or
technological network, it can be helpful to decompose the network into smaller
subunits or modules. In this article, we develop an information-theoretic
foundation for the concept of modularity in networks. We identify the modules
of which the network is composed by finding an optimal compression of its
topology, capitalizing on regularities in its structure. We explain the
advantages of this approach and illustrate them by partitioning a number of
real-world and model networks.
Comment: 5 pages, 4 figures
Constrained information flows in temporal networks reveal intermittent communities
Many real-world networks represent dynamic systems with interactions that
change over time, often in uncoordinated ways and at irregular intervals. For
example, university students connect in intermittent groups that repeatedly
form and dissolve based on multiple factors, including their lectures,
interests, and friends. Such dynamic systems can be represented as multilayer
networks where each layer represents a snapshot of the temporal network. In
this representation, it is crucial that the links between layers accurately
capture real dependencies between those layers. Often, however, these
dependencies are unknown. Therefore, current methods connect layers based on
simplistic assumptions that do not capture node-level layer dependencies. For
example, connecting every node to itself in other layers with the same weight
can wipe out dependencies between intermittent groups, making it difficult or
even impossible to identify them. In this paper, we present a principled
approach to estimating node-level layer dependencies based on the network
structure within each layer. We implement our node-level coupling method in the
community detection framework Infomap and demonstrate its performance compared
to current methods on synthetic and real temporal networks. We show that our
approach more effectively constrains information inside multilayer communities
so that Infomap can better recover planted groups in multilayer benchmark
networks that represent multiple modes with different groups and better
identify intermittent communities in real temporal contact networks. These
results suggest that node-level layer coupling can improve the modeling of
information spreading in temporal networks and better capture intermittent
community structure.
Comment: 10 pages, 10 figures, published in PR
Robustness of journal rankings by network flows with different amounts of memory
As the number of scientific journals has multiplied, journal rankings have
become increasingly important for scientific decisions. From submissions and
subscriptions to grants and hirings, researchers, policy makers, and funding
agencies make important decisions with influence from journal rankings such as
the ISI journal impact factor. Typically, the rankings are derived from the
citation network between a selection of journals and unavoidably depend on this
selection. However, little is known about how robust rankings are to the
selection of included journals. Here we compare the robustness of three journal
rankings based on network flows induced on citation networks. They model
pathways of researchers navigating scholarly literature, stepping between
journals and remembering their previous steps to different degrees: zero-step
memory as impact factor, one-step memory as Eigenfactor, and two-step memory,
corresponding to zero-, first-, and second-order Markov models of citation flow
between journals. We conclude that higher-order Markov models perform better
and are more robust to the selection of journals. Whereas our analysis
indicates that higher-order models perform better, the performance gain for the
second-order Markov model comes at the cost of requiring more citation data
over a longer time period.
Comment: 9 pages, 5 figures